Contemporary predictive models are hard to interpret, as their deep networks exploit numerous complex relations between input features. This work suggests a theoretical framework for model interpretability that measures the contribution of relevant features to the functional entropy of the network with respect to its input. We rely on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information with respect to the covariance of the data. This provides a principled way to measure the amount of information a subset of features contributes to the decision function. Through extensive experiments, we show that our method surpasses existing interpretability sampling-based methods on various data signals such as image, text, and audio.
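For concreteness, the functional-entropy quantity this abstract builds on, Ent(f) = E[f log f] − E[f] log E[f] for a non-negative function f, can be estimated by plain Monte-Carlo sampling. The sketch below is illustrative only (the test functions and estimator are assumptions, not the paper's method): a constant function has zero functional entropy, while an input-dependent one has strictly positive entropy.

```python
import math
import random

def functional_entropy(f_values):
    """Monte-Carlo estimate of Ent(f) = E[f log f] - E[f] log E[f]
    for a non-negative function f, given samples f(x_1), ..., f(x_n)
    drawn from the data distribution."""
    mean_f = sum(f_values) / len(f_values)
    mean_flogf = sum(v * math.log(v) for v in f_values) / len(f_values)
    return mean_flogf - mean_f * math.log(mean_f)

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# A constant function carries no information about the input: entropy 0.
const_ent = functional_entropy([1.0 for _ in xs])

# A function that depends on x has strictly positive functional entropy.
dep_ent = functional_entropy([math.exp(x) for x in xs])

print(const_ent, dep_ent)
```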
A key impediment to integrating machine learning models in medicine is explaining their reasoning. Popular explainability methods have demonstrated satisfactory results in natural image recognition, yet in medical image analysis many of these methods provide partial and noisy explanations. Recently, attention mechanisms have shown compelling results both in their predictive performance and in their interpretable qualities. A fundamental trait of attention is that it leverages salient parts of the input that contribute to the model's prediction. To this end, our work focuses on the explanatory value of attention weight distributions. We propose a multi-layer attention mechanism that enforces consistent interpretations between attended convolutional layers using convex optimization. We apply duality to decompose the consistency constraints between the layers by reparameterizing their attention probability distributions. We further suggest learning the dual witness by optimizing our objective; hence, our implementation uses standard back-propagation and is highly efficient. While preserving predictive performance, our proposed method leverages weakly annotated medical imaging data and provides complete and faithful explanations of the model's prediction.
The success of deep neural networks relies heavily on their ability to encode complex relations between their inputs and their outputs. While this property serves to fit the training data well, it also obscures the mechanism that drives prediction. This study aims to reveal hidden concepts by employing an intervention mechanism, based on discrete variational autoencoders, that shifts the predicted class. An explanatory model then visualizes the information encoded in any hidden layer together with its corresponding intervened representation. By assessing the differences between the original and intervened representations, one can determine the concepts capable of altering the class, thereby providing interpretability. We demonstrate the effectiveness of our approach on CelebA, where we show various visualizations of bias in the data and suggest different interventions to reveal and change bias.
To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which provide factorizations of conditional distributions and deterministic functions mapping noise to samples. Unfortunately, the causal mechanism is not uniquely determined by data gathered through observing and interacting with the world, so the question of how to choose a causal mechanism remains. In recent work, Oberst & Sontag (2019) proposed Gumbel-max SCMs, which use Gumbel-max reparameterizations as the causal mechanism due to an intuitively appealing counterfactual stability property. In this work, we instead argue for choosing the causal mechanism that minimizes a quantitative criterion when estimating counterfactual treatment effects, such as minimizing variance. We propose a parameterized family of causal mechanisms that generalize Gumbel-max. We show that they can be trained to minimize counterfactual effect variance and other losses on a distribution of queries of interest, yielding lower-variance estimates of counterfactual treatment effects than fixed alternatives, and also generalizing to queries not seen at training time.
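As background for the mechanism this abstract generalizes, the generic Gumbel-max trick can be sketched in a few lines (this is the standard trick only, not the paper's parameterized family): perturbing each logit with independent Gumbel(0, 1) noise and taking the argmax draws a sample whose distribution matches the softmax of the logits.

```python
import math
import random

random.seed(0)

def gumbel_max_sample(logits):
    """Draw one categorical sample via the Gumbel-max trick: perturb each
    logit with independent Gumbel(0, 1) noise and take the argmax."""
    gumbels = [-math.log(-math.log(random.random())) for _ in logits]
    perturbed = [l + g for l, g in zip(logits, gumbels)]
    return max(range(len(logits)), key=lambda i: perturbed[i])

logits = [1.0, 0.5, 0.0]
n = 20_000
counts = [0, 0, 0]
for _ in range(n):
    counts[gumbel_max_sample(logits)] += 1
freqs = [c / n for c in counts]

# The empirical frequencies should match the softmax of the logits.
z = sum(math.exp(l) for l in logits)
softmax = [math.exp(l) / z for l in logits]
print(freqs, softmax)
```

Fixing the realized Gumbel noise while changing the logits is what allows a Gumbel-max SCM to answer counterfactual queries under an intervention.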
Magnetic Resonance Fingerprinting (MRF) is an efficient quantitative MRI technique that can extract important tissue and system parameters such as T1, T2, B0, and B1 from a single scan. This property also makes it attractive for retrospectively synthesizing contrast-weighted images. In general, contrast-weighted images like T1-weighted, T2-weighted, etc., can be synthesized directly from parameter maps through spin-dynamics simulation (i.e., Bloch or Extended Phase Graph models). However, these approaches often exhibit artifacts due to imperfections in the mapping, the sequence modeling, and the data acquisition. Here we propose a supervised learning-based method that directly synthesizes contrast-weighted images from the MRF data without going through the quantitative mapping and spin-dynamics simulation. To implement our direct contrast synthesis (DCS) method, we deploy a conditional Generative Adversarial Network (GAN) framework and propose a multi-branch U-Net as the generator. The input MRF data are used to directly synthesize T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images through supervised training on paired MRF and target spin echo-based contrast-weighted scans. In-vivo experiments demonstrate excellent image quality compared to simulation-based contrast synthesis and previous DCS methods, both visually as well as by quantitative metrics. We also demonstrate cases where our trained model is able to mitigate in-flow and spiral off-resonance artifacts that are typically seen in MRF reconstructions and thus more faithfully represent conventional spin echo-based contrast-weighted images.
In the framework of online convex optimization, most iterative algorithms require the computation of projections onto convex sets, which can be computationally expensive. To tackle this problem, HK12 proposed the study of projection-free methods that replace projections with less expensive computations. The most common approach is based on the Frank-Wolfe method, which uses linear optimization computations in lieu of projections. Recent work by GK22 gave sublinear adaptive regret guarantees with projection-free algorithms based on the Frank-Wolfe approach. In this work we give projection-free algorithms that are based on a different technique, inspired by Mhammedi22, that replaces projections with set-membership computations. We propose a simple lazy gradient-based algorithm with a Minkowski regularization that attains near-optimal adaptive regret bounds. For general convex loss functions we improve previous adaptive regret bounds from $O(T^{3/4})$ to $O(\sqrt{T})$, and further to the tight interval-dependent bound $\tilde{O}(\sqrt{I})$ where $I$ denotes the interval length. For strongly convex functions we obtain the first poly-logarithmic adaptive regret bounds using a projection-free algorithm.
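To illustrate why a linear optimization oracle can stand in for projection, here is a minimal Frank-Wolfe sketch (illustrative only; not the lazy, set-membership-based algorithm proposed in this work). Minimizing a quadratic over the $\ell_1$ ball touches the feasible set only through the oracle, which for the $\ell_1$ ball returns a signed vertex, so no projection is ever computed.

```python
# Minimize f(x) = ||x - b||^2 over the l1 ball of radius 1 using only a
# linear optimization oracle, never a projection.
def grad(x, b):
    return [2 * (xi - bi) for xi, bi in zip(x, b)]

def lmo_l1(g, radius=1.0):
    """Linear minimization oracle for the l1 ball: the minimizer of <g, v>
    is a signed vertex along the coordinate with the largest |gradient|."""
    i = max(range(len(g)), key=lambda j: abs(g[j]))
    v = [0.0] * len(g)
    v[i] = -radius if g[i] > 0 else radius
    return v

b = [1.0, 0.5]
x = [0.0, 0.0]
for t in range(2000):
    v = lmo_l1(grad(x, b))
    gamma = 2.0 / (t + 2.0)          # standard Frank-Wolfe step-size schedule
    x = [(1 - gamma) * xi + gamma * vi for xi, vi in zip(x, v)]

f_x = sum((xi - bi) ** 2 for xi, bi in zip(x, b))
# The optimum is the l1 projection of b: x* = (0.75, 0.25), f(x*) = 0.125.
print(x, f_x)
```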
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
In the matrix completion problem, one wishes to reconstruct a low-rank matrix based on a revealed set of (possibly noisy) entries. Prior work considers completing the entire matrix, which may be highly inaccurate in the common case where the distribution over entries is non-uniform. We formalize the problem of partial matrix completion, where the goal is to complete a substantial subset of the entries, or equivalently to complete the entire matrix and specify an accurate subset of the entries. Interestingly, even though the distribution is unknown and arbitrarily complex, our efficient algorithm is able to guarantee: (a) high accuracy over all completed entries, and (b) high coverage, meaning that it covers at least as much of the matrix as the distribution of observations.
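As a toy illustration of the underlying completion task (not the paper's algorithm, and without the partial-completion guarantees), a noiseless rank-1 matrix M = u vᵀ can be recovered from a subset of its entries by alternating least squares, with the hidden entries filled in from the learned factors:

```python
# Rank-1 alternating least squares on partially observed entries.
def als_rank1(shape, observed, iters=100):
    m, n = shape
    u = [1.0] * m
    v = [1.0] * n
    for _ in range(iters):
        for i in range(m):
            num = sum(val * v[j] for (r, j), val in observed.items() if r == i)
            den = sum(v[j] ** 2 for (r, j) in observed if r == i)
            if den > 0:
                u[i] = num / den
        for j in range(n):
            num = sum(val * u[i] for (i, c), val in observed.items() if c == j)
            den = sum(u[i] ** 2 for (i, c) in observed if c == j)
            if den > 0:
                v[j] = num / den
    return u, v

# Ground-truth rank-1 matrix with two entries hidden.
true_u, true_v = [1.0, 2.0, 3.0], [1.0, -1.0, 2.0]
hidden = {(0, 1), (2, 2)}
observed = {(i, j): true_u[i] * true_v[j]
            for i in range(3) for j in range(3) if (i, j) not in hidden}

u, v = als_rank1((3, 3), observed)
completed = [u[i] * v[j] for (i, j) in sorted(hidden)]
print(completed)  # hidden entries are recovered in the noiseless case
```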
The group synchronization problem involves estimating a collection of group elements from noisy measurements of their pairwise ratios. This task is a key component of many computational problems, including the molecular reconstruction problem in single-particle cryo-electron microscopy (cryo-EM). The standard methods for estimating the group elements are based on iteratively applying linear and non-linear operators. Motivated by the structural similarity to deep neural networks, we adopt the concept of algorithm unrolling, in which training data are used to optimize the algorithm. We design unrolled algorithms for several group synchronization instances, including synchronization over the group of 3-D rotations: the synchronization problem arising in cryo-EM. We also apply a similar approach to the multi-reference alignment problem. We show by numerical experiments that the unrolling strategy outperforms existing synchronization algorithms in a wide variety of scenarios.
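For intuition, a classical (non-learned) baseline of the kind this work unrolls can be sketched for the simplest instance, synchronization over planar rotations: recover angles θ_i from pairwise measurements H_ij = exp(i(θ_i − θ_j)) via power iteration, whose leading eigenvector gives the group elements up to a global phase.

```python
# Spectral angular synchronization over SO(2) (noiseless sketch).
import cmath
import random

random.seed(0)
n = 8
theta = [random.uniform(0, 2 * cmath.pi) for _ in range(n)]
z = [cmath.exp(1j * t) for t in theta]

# Pairwise ratio measurements: H = z z^H, a rank-one matrix.
H = [[z[i] * z[j].conjugate() for j in range(n)] for i in range(n)]

# Power iteration: the leading eigenvector of H recovers z up to a global phase.
v = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]
for _ in range(50):
    w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
    norm = sum(abs(wi) ** 2 for wi in w) ** 0.5
    v = [wi / norm for wi in w]

# Estimated group elements (unit phases); pairwise ratios should match.
est = [vi / abs(vi) for vi in v]
errs = [abs(est[i] * est[j].conjugate() - z[i] * z[j].conjugate())
        for i in range(n) for j in range(n)]
print(max(errs))
```

The unrolled algorithms described above replace the fixed linear and non-linear iteration steps of such baselines with learned, trainable counterparts.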
We markedly improve the convergence speed and global exploration capabilities of particle swarm optimization (PSO) through targeted, position-mutated elitism (PSO-TPME). The three key innovations address particle classification, elitism, and mutation in the cognitive and social models. PSO-TPME is benchmarked against five popular PSO variants on multi-dimensional functions widely adopted in the optimization field, with particular attention to convergence accuracy, convergence speed, and the ability to find the global minimum. The statistical error is assessed over many repeated evaluations. The simulations show that the proposed PSO variant outperforms the other variants in terms of convergence rate and accuracy by orders of magnitude.
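For reference, the canonical PSO update that such variants build on can be written in a few lines (baseline only; the classification, elitism, and mutation stages of PSO-TPME are not included): each particle's velocity blends its inertia, a pull toward its personal best, and a pull toward the swarm's global best.

```python
# Canonical particle swarm optimization on the 2-D sphere function.
import random

random.seed(0)

def sphere(x):
    return sum(xi ** 2 for xi in x)

dim, n_particles, iters = 2, 20, 200
w, c1, c2 = 0.7, 1.5, 1.5   # inertia, cognitive, and social coefficients

pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
vel = [[0.0] * dim for _ in range(n_particles)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=sphere)

for _ in range(iters):
    for k in range(n_particles):
        for d in range(dim):
            r1, r2 = random.random(), random.random()
            vel[k][d] = (w * vel[k][d]
                         + c1 * r1 * (pbest[k][d] - pos[k][d])
                         + c2 * r2 * (gbest[d] - pos[k][d]))
            pos[k][d] += vel[k][d]
        if sphere(pos[k]) < sphere(pbest[k]):
            pbest[k] = pos[k][:]
    gbest = min(pbest, key=sphere)

print(sphere(gbest))   # converges toward the global minimum at the origin
```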